Approximate Dynamic Programming By Minimizing Distributionally Robust Bounds
Author
Abstract
Approximate dynamic programming (ADP) is a popular method for solving large Markov decision processes. This paper describes a new class of ADP methods, distributionally robust ADP (DRADP), that address the curse of dimensionality by minimizing a pessimistic bound on the policy loss. This approach turns ADP into an optimization problem, whose properties we analyze and for which we derive new mathematical programming formulations. DRADP improves on the theoretical guarantees of existing ADP methods: it guarantees convergence and provides L1 norm-based error bounds. The empirical evaluation of DRADP shows that these theoretical guarantees translate into good performance on benchmark problems.
Related resources
A Practically Efficient Approach for Solving Adaptive Distributionally Robust Linear Optimization Problems
We develop a modular and tractable framework for solving an adaptive distributionally robust linear optimization problem, where we minimize the worst-case expected cost over an ambiguity set of probability distributions. The adaptive distributionally robust optimization framework caters for dynamic decision making, where decisions can adapt to the uncertain outcomes as they unfold in stages. Fo...
Distributionally Robust Logistic Regression
This paper proposes a distributionally robust approach to logistic regression. We use the Wasserstein distance to construct a ball in the space of probability distributions centered at the uniform distribution on the training samples. If the radius of this ball is chosen judiciously, we can guarantee that it contains the unknown data-generating distribution with high confidence. We then formulat...
Distributionally Robust Stochastic Knapsack Problem
This paper considers a distributionally robust version of a quadratic knapsack problem. In this model, a subset of items is selected to maximize the total profit while requiring that a set of knapsack constraints be satisfied with high probability. In contrast to the stochastic programming version of this problem, we assume that only partial information on the random data is known, i.e., the firs...
Distributionally robust stochastic shortest path problem
This paper considers a stochastic version of the shortest path problem, the Distributionally Robust Stochastic Shortest Path Problem (DRSSPP), on directed graphs. In this model, the arc costs are deterministic, while each arc has a random delay. The mean vector and the second-moment matrix of the uncertain data are assumed known, but the exact distribution is unknown. A penalty...
A Cutting Surface Algorithm for Semi-Infinite Convex Programming with an Application to Moment Robust Optimization
We first present and analyze a central cutting surface algorithm for general semi-infinite convex optimization problems, and use it to develop an algorithm for distributionally robust optimization problems in which the uncertainty set consists of probability distributions with given bounds on their moments. The cutting surface algorithm is also applicable to problems with non-differentiable sem...
Journal title:
- CoRR
Volume: abs/1205.1782
Issue: -
Pages: -
Publication date: 2012